Large language models (LLMs), as used in ChatGPT, have been one of the defining areas of the 'new AI'. At their simplest they are word predictors: they take a text and attempt to predict the next word that will appear. If this is repeated, with each predicted word appended to the text before predicting again, whole new texts can be created. LLMs build on the success of simple statistical methods such as n-grams which, when trained on very large corpora, were found to be 'unreasonably effective' at tasks that had previously been thought to require complex natural language processing. LLMs leverage the same big data, from web documents, media feeds, social media and forums, but use deep neural networks, which appear able to identify higher levels of meaning such as topics. The addition of attention mechanisms in transformer models has allowed LLMs to make use of long-range patterns in language, such as identifying the referents of pronouns or returning to previous topics. The texts and chats produced by LLMs can be indistinguishable from those of humans and can thus be argued to pass the Turing test.
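The idea of repeated next-word prediction can be illustrated with the simplest n-gram case, a bigram model. The following is a minimal sketch only, not how an LLM is implemented: the corpus is a toy sentence, and the names `train_bigrams`, `predict_next` and `generate` are hypothetical, chosen for illustration. A real LLM replaces the count table with a deep neural network trained on a vastly larger corpus.

```python
import random
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = followers.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

def generate(followers, start, length=8, seed=0):
    """Build a text by repeatedly sampling a next word in proportion to its count."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        counts = followers.get(out[-1])
        if not counts:
            break  # the last word was never followed by anything in training
        choices, weights = zip(*counts.items())
        out.append(rng.choices(choices, weights=weights)[0])
    return out

# Toy corpus (an assumption for illustration; real models use very large corpora).
CORPUS = "the cat sat on the mat the cat ate the fish"
model = train_bigrams(CORPUS)

print(predict_next(model, "the"))       # 'cat' follows 'the' most often here
print(" ".join(generate(model, "the")))
```

Because the bigram model sees only the single preceding word, it cannot track a pronoun's referent or return to an earlier topic; this is the kind of long-range pattern that attention mechanisms are described above as addressing.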
Used on pages 10, 289, 313, 314, 545, 556, 557, 568, 574, 582, 584
Also known as LLM
Links:
- arXiv: article: GPT-4 Technical Report. OpenAI (2023).
- doi.ieeecomputersociety.org: article: The unreasonable effectiveness of data. A. Halevy, P. Norvig, and F. Pereira (2009).